
    How to See with an Event Camera

    Get PDF
    Seeing enables us to recognise people and things, detect motion, perceive our 3D environment and more. Light stimulates our eyes, sending electrical impulses to the brain where we form an image and extract useful information. Computer vision aims to endow computers with the ability to interpret and understand visual information - an artificial analogue to human vision. Traditionally, images from a conventional camera are processed by algorithms designed to extract information. Event cameras are bio-inspired sensors that offer improvements over conventional cameras. They (i) are fast, (ii) can see dark and bright at the same time, (iii) have less motion-blur, (iv) use less energy and (v) transmit data efficiently. However, it is difficult for humans and computers alike to make sense of the raw output of event cameras, called events, because events look nothing like conventional images. This thesis presents novel techniques for extracting information from events via: (i) reconstructing images from events, then processing the images using conventional computer vision, and (ii) processing events directly to obtain desired information. To advance both fronts, a key goal is to develop a sophisticated understanding of event camera output, including its noise properties. Chapters 3 and 4 present fast algorithms that process each event upon arrival to continuously reconstruct the latest image and extract information. Chapters 5 and 6 apply machine learning to event cameras, letting the computer learn from a large amount of data how to process event data to reconstruct video and estimate motion. I hope the algorithms presented in this thesis will take us one step closer to building intelligent systems that can see with event cameras.
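    As an illustration of the per-event processing idea described above, the sketch below keeps one log-intensity value per pixel, decays it exponentially between events and adds a signed contrast step when each event arrives. It is a minimal leaky-integrator example, not the thesis's algorithms; the decay rate alpha and contrast c are assumed illustrative values.

    ```python
    import numpy as np

    def reconstruct_per_event(events, height, width, alpha=2.0, c=0.1):
        """Toy per-event reconstruction: a per-pixel leaky integrator over the
        event stream. Each event (t, x, y, p) first decays the pixel's
        log-intensity state, then adds a signed contrast step c.
        Illustrative sketch only, not the thesis algorithms."""
        log_image = np.zeros((height, width))   # per-pixel log-intensity state
        last_time = np.zeros((height, width))   # time of the last update per pixel
        for t, x, y, p in events:               # events assumed sorted by timestamp
            dt = t - last_time[y, x]
            log_image[y, x] *= np.exp(-alpha * dt)   # exponential decay since last event
            log_image[y, x] += c if p > 0 else -c    # apply the event's contrast step
            last_time[y, x] = t
        return log_image

    # Example with a few synthetic events (timestamp, x, y, polarity).
    events = [(0.01, 5, 5, +1), (0.02, 5, 5, +1), (0.03, 6, 5, -1)]
    print(reconstruct_per_event(events, height=10, width=10)[5, 5])
    ```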

    CED: Color Event Camera Dataset

    Full text link
    Event cameras are novel, bio-inspired visual sensors whose pixels output asynchronous and independent timestamped spikes at local intensity changes, called 'events'. Event cameras offer advantages over conventional frame-based cameras in terms of latency, high dynamic range (HDR) and temporal resolution. Until recently, event cameras have been limited to outputting events in the intensity channel; however, recent advances have resulted in the development of color event cameras, such as the Color-DAVIS346. In this work, we present and release the first Color Event Camera Dataset (CED), containing 50 minutes of footage with both color frames and events. CED features a wide variety of indoor and outdoor scenes, which we hope will help drive forward event-based vision research. We also present an extension of the event camera simulator ESIM that enables simulation of color events. Finally, we present an evaluation of three state-of-the-art image reconstruction methods that can be used to convert the Color-DAVIS346 into a continuous-time, HDR, color video camera, both to visualise the event stream and for use in downstream vision applications. Comment: Conference on Computer Vision and Pattern Recognition Workshops.
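    To make the event representation concrete, the sketch below models a color event as a timestamped, signed spike tagged with the color channel of its pixel, and accumulates events into per-channel count images for visualisation. The field names and the per-pixel channel tag are illustrative assumptions, not the CED file format.

    ```python
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class ColorEvent:
        t: float       # timestamp in seconds
        x: int         # pixel column
        y: int         # pixel row
        polarity: int  # +1 brightness increase, -1 decrease
        channel: str   # 'R', 'G' or 'B': Bayer channel of the pixel (assumed)

    def accumulate(events, height, width):
        """Accumulate signed event counts into one image per color channel,
        a simple way to visualise a short slice of the event stream."""
        frames = {ch: np.zeros((height, width)) for ch in "RGB"}
        for e in events:
            frames[e.channel][e.y, e.x] += e.polarity
        return frames

    evs = [ColorEvent(0.001, 3, 2, +1, "G"), ColorEvent(0.002, 3, 2, +1, "G")]
    print(accumulate(evs, height=4, width=4)["G"][2, 3])   # -> 2.0
    ```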

    An Asynchronous Kalman Filter for Hybrid Event Cameras

    Full text link
    Event cameras are ideally suited to capturing High Dynamic Range (HDR) visual information without blur but perform poorly on static or slowly changing scenes. Conversely, conventional image sensors measure the absolute intensity of slowly changing scenes effectively but do poorly on HDR or quickly changing scenes. In this paper, we present an event-based video reconstruction pipeline for HDR scenarios. The proposed algorithm includes a frame-augmentation pre-processing step that deblurs and temporally interpolates frame data using events. The augmented frame and event data are then fused using a novel asynchronous Kalman filter under a unifying uncertainty model for both sensors. We evaluate our method on publicly available datasets with challenging lighting conditions and fast motion, as well as on our new dataset with HDR reference. The proposed algorithm outperforms state-of-the-art methods in both absolute intensity error (48% reduction) and image similarity indexes (average 11% improvement). Comment: 12 pages, 6 figures, published in the International Conference on Computer Vision (ICCV) 2021.
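    The sketch below illustrates the flavour of such asynchronous, per-pixel fusion: events drive the prediction step and inflate the state uncertainty, while each frame supplies an absolute-intensity measurement that corrects the state. It is a scalar toy filter with assumed noise parameters, not the paper's uncertainty model.

    ```python
    import numpy as np

    class PixelAKF:
        """Scalar Kalman filter for one pixel's log-intensity.
        Events act as increments in the prediction step and inflate the
        variance; frames act as absolute measurements in the update step.
        The noise parameters are assumed values, not the paper's model."""

        def __init__(self, q_event=1e-3, r_frame=1e-2, c=0.1):
            self.x = 0.0       # log-intensity estimate
            self.P = 1.0       # estimate variance
            self.q = q_event   # process noise added per event
            self.r = r_frame   # frame measurement noise
            self.c = c         # contrast step per event

        def on_event(self, polarity):
            # Predict: integrate the event increment; uncertainty grows.
            self.x += self.c * polarity
            self.P += self.q

        def on_frame(self, log_intensity):
            # Update: fuse the absolute measurement using the Kalman gain.
            k = self.P / (self.P + self.r)
            self.x += k * (log_intensity - self.x)
            self.P *= 1.0 - k

    f = PixelAKF()
    f.on_frame(np.log(0.5))
    f.on_event(+1)
    print(f.x, f.P)
    ```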

    An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras

    Full text link
    Event cameras are ideally suited to capturing High Dynamic Range (HDR) visual information without blur but provide poor imaging capability for static or slowly varying scenes. Conversely, conventional image sensors measure the absolute intensity of slowly changing scenes effectively but do poorly on HDR or quickly changing scenes. In this paper, we present an asynchronous linear filter architecture, fusing event and frame camera data, for HDR video reconstruction and spatial convolution that exploits the advantages of both sensor modalities. The key idea is the introduction of a state that directly encodes the integrated or convolved image information and that is updated asynchronously as each event or each frame arrives from the camera. The state can be read off as often as, and whenever, required to feed into subsequent vision modules for real-time robotic systems. We evaluate our method on publicly available datasets with challenging lighting conditions and fast motion, along with a new dataset with HDR reference that we provide. The proposed AKF pipeline outperforms other state-of-the-art methods in both absolute intensity error (69.4% reduction) and image similarity indexes (average 35.5% improvement). We also demonstrate image convolution with linear spatial kernels (Gaussian, Sobel, and Laplacian) as an application of our architecture. Comment: 17 pages, 10 figures, accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in August 2023.
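    A minimal sketch of the convolved-state idea follows: each incoming event stamps a polarity-scaled copy of the kernel onto the state, centred at the event's pixel, and the state can be read off at any time. Border handling and the frame-update path are omitted, and the contrast step is an assumed value.

    ```python
    import numpy as np

    SOBEL_X = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)

    class ConvolvedState:
        """State that encodes the convolution of a linear spatial kernel with
        the integrated event image, updated asynchronously per event.
        Border events are skipped for brevity."""

        def __init__(self, height, width, kernel, c=0.1):
            self.state = np.zeros((height, width))
            self.kernel = kernel
            self.r = kernel.shape[0] // 2
            self.c = c   # assumed contrast step per event

        def on_event(self, x, y, polarity):
            r, (h, w) = self.r, self.state.shape
            if r <= y < h - r and r <= x < w - r:
                # Each event stamps a polarity-scaled kernel centred at its pixel.
                self.state[y - r:y + r + 1, x - r:x + r + 1] += \
                    self.c * polarity * self.kernel

        def read(self):
            # The convolved image can be read off whenever required.
            return self.state.copy()

    s = ConvolvedState(8, 8, SOBEL_X)
    s.on_event(4, 4, +1)
    print(s.read()[4, 3:6])   # middle row of the stamped Sobel kernel
    ```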

    Asynchronous Tracking-by-Detection on Adaptive Time Surfaces for Event-based Object Tracking

    Full text link
    Event cameras, which are asynchronous bio-inspired vision sensors, have shown great potential in a variety of situations, such as fast motion and low-illumination scenes. However, most event-based object tracking methods are designed for scenarios with untextured objects and uncluttered backgrounds, and few support bounding-box-based object tracking. The main idea behind this work is to propose an asynchronous Event-based Tracking-by-Detection (ETD) method for generic bounding-box-based object tracking. To achieve this goal, we present an Adaptive Time-Surface with Linear Time Decay (ATSLTD) event-to-frame conversion algorithm, which asynchronously and effectively warps the spatio-temporal information of asynchronous retinal events into a sequence of ATSLTD frames with clear object contours. We feed the sequence of ATSLTD frames to the proposed ETD method to perform accurate and efficient object tracking, leveraging the high temporal resolution of event cameras. We compare the proposed ETD method with seven popular object tracking methods, based on conventional cameras or event cameras, and with two variants of ETD. The experimental results show the superiority of the proposed ETD method in handling various challenging environments. Comment: 9 pages, 5 figures.
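    The sketch below renders a generic time surface with linear time decay: each pixel stores its most recent event timestamp, and the surface value falls linearly with the time elapsed since that event. The adaptive window selection that gives ATSLTD its name is omitted, and the window length tau is an assumed illustrative value.

    ```python
    import numpy as np

    def linear_decay_time_surface(events, t_now, height, width, tau=0.05):
        """Render a time surface at time t_now: pixels that fired recently are
        bright and the value decays linearly to zero over a window tau.
        The adaptive window selection of ATSLTD is omitted in this sketch."""
        last_t = np.full((height, width), -np.inf)
        for t, x, y, _polarity in events:        # events with t <= t_now
            last_t[y, x] = max(last_t[y, x], t)
        surface = 1.0 - (t_now - last_t) / tau   # linear decay with elapsed time
        return np.clip(surface, 0.0, 1.0)

    events = [(0.00, 2, 2, +1), (0.04, 3, 2, -1)]
    print(linear_decay_time_surface(events, t_now=0.05, height=5, width=5)[2, 2:4])
    ```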

    Asynchronous Spatial Image Convolutions for Event Cameras

    No full text
    Spatial convolution is arguably the most fundamental of two-dimensional image processing operations. Conventional spatial image convolution can only be applied to a conventional image, that is, an array of pixel values (or similar image representation) associated with a single instant in time. Event cameras have a serial, asynchronous output with no natural notion of an image frame, and each event arrives with a different timestamp. In this letter, we propose a method to compute the convolution of a linear spatial kernel with the output of an event camera. The approach operates on the event stream output of the camera directly, without synthesising pseudo-image frames as is common in the literature. The key idea is the introduction of an internal state that directly encodes the convolved image information and that is updated asynchronously as each event arrives from the camera. The state can be read off as often as, and whenever, required for use in higher-level vision algorithms for real-time robotic systems. We demonstrate the application of our method to corner detection, providing an implementation of a Harris corner-response “state” that can be used in real time for feature detection and tracking on robotic systems. This work was supported in part by the Australian Government Research Training Program Scholarship and in part by the Australian Research Council through the “Australian Centre of Excellence for Robotic Vision” under Grant CE140100016.
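    The corner-detection application can be sketched by maintaining two per-event convolved gradient states (Sobel x and y) and reading off a Harris response from them. The 3x3 window, contrast step and Harris constant below are illustrative choices, not the letter's implementation.

    ```python
    import numpy as np

    SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    SOBEL_Y = SOBEL_X.T

    def stamp(state, kernel, x, y, weight):
        """Add a weighted kernel centred at (x, y); border events are skipped."""
        r = kernel.shape[0] // 2
        h, w = state.shape
        if r <= y < h - r and r <= x < w - r:
            state[y - r:y + r + 1, x - r:x + r + 1] += weight * kernel

    def box3(a):
        """3x3 box sum, used here as the Harris window (an assumed choice)."""
        p = np.pad(a, 1)
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3))

    def harris_response(events, height, width, c=0.1, k=0.04):
        """Maintain per-event Sobel gradient states, then read off a Harris
        corner response: a sketch of the idea, not the letter's implementation."""
        gx = np.zeros((height, width))
        gy = np.zeros((height, width))
        for _t, x, y, p in events:
            stamp(gx, SOBEL_X, x, y, c * p)   # x-gradient state, updated per event
            stamp(gy, SOBEL_Y, x, y, c * p)   # y-gradient state, updated per event
        sxx, syy, sxy = box3(gx * gx), box3(gy * gy), box3(gx * gy)
        return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

    events = [(0.001, 4, 4, +1), (0.002, 5, 4, -1)]
    print(harris_response(events, height=9, width=9).max())
    ```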

    Fast Image Reconstruction with an Event Camera

    Full text link
